ASAP: Asynchronous Approximate Data-Parallel Computation

Authors

  • Asim Kadav
  • Erik Kruus
Abstract

Emerging workloads, such as graph processing and machine learning, are approximate because of the scale of the data involved and the stochastic nature of the underlying algorithms. These algorithms are often distributed over multiple machines using bulk-synchronous processing (BSP) or other synchronous processing paradigms such as map-reduce. However, data-parallel processing primitives such as repeated barrier and reduce operations introduce high synchronization overheads. Hence, many existing data-processing platforms use asynchrony and staleness to improve data-parallel job performance. Often, these systems simply replace synchronous communication between the worker nodes in the cluster with asynchronous communication. This improves data-processing throughput but results in poor accuracy of the final output, since different workers may progress at different speeds and process inconsistent intermediate outputs. In this paper, we present ASAP, a model that provides asynchronous and approximate processing semantics for data-parallel computation. ASAP provides fine-grained worker synchronization using NOTIFY-ACK semantics that allow independent workers to run asynchronously. ASAP also provides stochastic reduce, which offers approximate but guaranteed convergence to the same result as an aggregated all-reduce. Our results show that ASAP reduces synchronization costs, providing 2-10X speedups in convergence and up to 10X savings in network costs for distributed machine learning applications, while retaining strong convergence guarantees.
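
The abstract does not spell out how stochastic reduce works, so the following is only an illustrative sketch of the general idea it points to: workers that repeatedly average with a few neighbors on a sparse communication graph can still converge to the value a full all-reduce average would produce. The ring topology, the equal mixing weights, and the names `mixing_step` and `sparse_average` are assumptions made for this example, not ASAP's actual algorithm or API.

```python
# Illustrative sketch only: neighbor averaging on a ring, NOT the ASAP implementation.
# Assumption: each worker holds one scalar and talks only to its two ring neighbors.

def mixing_step(values, offsets=(-1, 0, 1)):
    """One communication round: each worker averages itself with its ring neighbors."""
    n = len(values)
    return [
        sum(values[(i + d) % n] for d in offsets) / len(offsets)
        for i in range(n)
    ]

def sparse_average(values, rounds):
    """Apply several sparse averaging rounds and return the per-worker values."""
    for _ in range(rounds):
        values = mixing_step(values)
    return values

# Hypothetical per-worker values (for example, one model parameter per worker).
workers = [1.0, 5.0, 9.0, 3.0, 7.0, 11.0, 2.0, 10.0]

# What a full all-reduce average would give every worker in a single step.
target = sum(workers) / len(workers)

# Each round costs 2 messages per worker instead of communicating with all peers.
result = sparse_average(workers, rounds=200)
print(target, max(abs(v - target) for v in result))
```

Because every worker gives and receives equal weight, the sum across workers is preserved in each round, which is why the values settle on the global mean. The trade-off is more rounds in exchange for far fewer messages per round; how quickly the values agree depends on the connectivity of the chosen graph.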

Similar resources

Asynchronous Distributed Estimation of Topic Models for Document Analysis

Given the prevalence of large data sets and the availability of inexpensive parallel computing hardware, there is significant motivation to explore distributed implementations of statistical learning algorithms. In this paper, we present a distributed learning framework for Latent Dirichlet Allocation (LDA), a well-known Bayesian latent variable model for sparse matrices of count data. In the p...

Implementation of Parallel Arithmetic in a Cellular Automaton

We describe an approach to parallel computation using particle propagation and collisions in a one-dimensional cellular automaton using a particle model: a Particle Machine (PM). Such a machine has the parallelism, structural regularity, and local connectivity of systolic arrays, but is general and programmable. It contains no explicit multipliers, adders, or other fixed arithmetic operations; t...

Analog VLSI arrays for morphological image processing

A two-dimensional analog VLSI array that performs basic morphological image processing operations is presented. The system uses a smart pixel approach that facilitates the parallel computation of continuous real-time outputs. Photodetectors within the array of smart pixels also allow for parallel optical inputs. The processing is performed by current-mode circuitry implemented with CMOS technol...

Context-Aware Process Networks

In industry, embedded systems for stream-based processing are often modelled and verified by using process networks, such as Kahn process networks. An advantage of Kahn networks is that they allow asynchronous operation of process components in a network. A problem in these networks, however, is that asynchronously interfering events cannot be handled properly because they are intrinsically ind...

A massively parallel implementation of the watershed based on cellular automata

The watershed transform is a very powerful segmentation tool which comes directly from the idea of the watershed line in geohydrology. It has proved its efficiency in many computer vision application fields. This paper presents a new implementation of the watershed which is optimal with respect to computation time. The flooding algorithm is recalled. Then, a massively parallel cellular automa...

Journal title:
  • CoRR

Volume: abs/1612.08608

Publication date: 2016